Skip to content

[Enhancement](udf) Support volatility property for scalar UDF#62698

Open
linrrzqqq wants to merge 3 commits into
apache:masterfrom
linrrzqqq:udf-nondeterministic
Open

[Enhancement](udf) Support volatility property for scalar UDF#62698
linrrzqqq wants to merge 3 commits into
apache:masterfrom
linrrzqqq:udf-nondeterministic

Conversation

@linrrzqqq
Copy link
Copy Markdown
Collaborator

@linrrzqqq linrrzqqq commented Apr 22, 2026

What problem does this PR solve?

Issue Number: close #xxx

Related PR: #xxx

Problem Summary:

Previously, UDFs could be treated as deterministic in optimizer-related paths, which is unsafe for UDFs whose results are not stable across evaluations. That may cause invalid rewrite/planning decisions and lead to incorrect query semantics in some cases.

Introduce immutable, stable, and volatile semantics through "volatility" = "immutable|stable|volatile", persist the property in function metadata, and use it to drive deterministic and volatile-expression behavior in Nereids.

Immutable UDFs are treated as deterministic, stable UDFs avoid volatile identity handling while remaining non-deterministic, and volatile UDFs receive per-call volatile identities to protect unsafe rewrites.

CREATE TABLE cte_uuid_seed (id INT) ENGINE=OLAP DUPLICATE KEY(id)
DISTRIBUTED BY HASH(id) BUCKETS 1 PROPERTIES ("replication_num" = "1");
INSERT INTO cte_uuid_seed VALUES (1),(2),(3);

DROP FUNCTION IF EXISTS py_uuid_token(INT);
CREATE FUNCTION py_uuid_token(INT)
RETURNS STRING
PROPERTIES (
    "type" = "PYTHON_UDF",
    "symbol" = "py_uuid_token_impl",
    "always_nullable" = "false",
    "runtime_version" = "3.12.11"
)
AS $$
import uuid
def py_uuid_token_impl(x):
    return f"{x}-{uuid.uuid4()}"
$$;

before:

SET enable_cte_materialize = true;
SET inline_cte_referenced_threshold = 10;

-- treated as volatile func(UniqueFunction), which caused wrong planning
WITH cte AS (SELECT id, py_uuid_token(id) AS token FROM cte_uuid_seed)
SELECT id, COUNT(DISTINCT token) AS distinct_tokens
FROM (SELECT id, token FROM cte UNION ALL SELECT id, token FROM cte) u
GROUP BY id ORDER BY id;
+------+-----------------+
| id   | distinct_tokens |
+------+-----------------+
|    1 |               2 |
|    2 |               2 |
|    3 |               2 |
+------+-----------------+

now

+------+-----------------+
| id   | distinct_tokens |
+------+-----------------+
|    1 |               1 |
|    2 |               1 |
|    3 |               1 |
+------+-----------------+

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@Thearas
Copy link
Copy Markdown
Contributor

Thearas commented Apr 22, 2026

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@linrrzqqq
Copy link
Copy Markdown
Collaborator Author

run buildall

@linrrzqqq
Copy link
Copy Markdown
Collaborator Author

/review

@github-actions
Copy link
Copy Markdown
Contributor

OpenCode automated review failed and did not complete.

Error: Review step was skipped (possibly timeout or cancelled)
Workflow run: https://github.com/apache/doris/actions/runs/24826801894

Please inspect the workflow logs and rerun the review after the underlying issue is resolved.

@hello-stephen
Copy link
Copy Markdown
Contributor

FE UT Coverage Report

Increment line coverage 0.00% (0/6) 🎉
Increment coverage report
Complete coverage report

@hello-stephen
Copy link
Copy Markdown
Contributor

FE Regression Coverage Report

Increment line coverage 100.00% (6/6) 🎉
Increment coverage report
Complete coverage report

@linrrzqqq linrrzqqq force-pushed the udf-nondeterministic branch from f2edf43 to 3678922 Compare April 23, 2026 11:25
@linrrzqqq
Copy link
Copy Markdown
Collaborator Author

run buildall

@linrrzqqq
Copy link
Copy Markdown
Collaborator Author

/review

Copy link
Copy Markdown
Contributor

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found blocking correctness issues.

  1. Blanket isDeterministic() == false for all Java/Python UDF/UDAF/UDTF classes is too broad. The modified regression coverage uses deterministic helpers (IntTest, MySumInt, FloatTest, float_test.py), so this change now rejects pure external UDFs from MV/MTMV unless users opt into enable_nondeterministic_function=true and stops existing MV rewrite coverage from applying.

  2. The optimizer fix is incomplete. Several rewrite paths that can duplicate or relocate expressions still key off containsUniqueFunction() rather than !isDeterministic(), so external UDFs can still be evaluated multiple times or moved incorrectly. A simple filter(project(udf(...) as a)) case is still vulnerable.

  3. The bundled BE fix correctly handles unaligned Decimal256 Arrow reads, but the adjacent TYPE_DECIMALV2 Arrow branch still performs the same unsafe reinterpret_cast load and keeps the misalignment UB for DECIMALV2.

Critical checkpoints:

  • Goal / correctness: The PR fixes the specific CTE nondeterminism example, but not the end-to-end optimizer correctness problem and it broadens semantics for all external UDFs.
  • Scope / minimality: Not minimal; it changes default behavior for every Java/Python UDF/UDAF/UDTF and rewrites existing deterministic-UDF MV tests around that change.
  • Concurrency / lifecycle: No new concurrency or lifecycle issue found in the touched code.
  • Config / compatibility: No new config or compatibility issue found.
  • Parallel paths: Not all relevant rewrite paths were updated.
  • Tests: Added CTE/MTMV coverage is useful, but the modified float/java tests now encode the broader regression for deterministic UDFs and there is no regression case for the remaining filter/project duplication path.
  • Observability / transaction / FE-BE variable passing: Not applicable in this diff.
  • Performance: Blanket nondeterminism disables valid optimizations for pure UDFs.

No additional user-provided focus was supplied.

Comment thread be/src/core/data_type_serde/data_type_decimal_serde.cpp Outdated
@linrrzqqq linrrzqqq force-pushed the udf-nondeterministic branch from 3678922 to f66187b Compare April 23, 2026 13:15
@linrrzqqq
Copy link
Copy Markdown
Collaborator Author

run buildall

@linrrzqqq
Copy link
Copy Markdown
Collaborator Author

/review

@hello-stephen
Copy link
Copy Markdown
Contributor

FE Regression Coverage Report

Increment line coverage 100.00% (6/6) 🎉
Increment coverage report
Complete coverage report

Copy link
Copy Markdown
Contributor

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Additional blockers beyond the existing inline threads:

  1. Pre-upgrade UDF metadata is not backward-compatible with the new persisted deterministic field. Old Java/Python UDFs/UDAFs/UDTFs replay as nondeterministic after upgrade, which silently changes SHOW CREATE FUNCTION output and can strand existing MTMVs in SCHEMA_CHANGE when ensureMTMVQueryUsable() re-analyzes the stored query.
  2. The new per-UDF deterministic property is still not honored in cache paths. Deterministic external UDFs remain blanket-ineligible for SQL cache and fragment query cache because those paths still special-case Udf/AliasFunction instead of consulting isDeterministic().

Critical checkpoints:

  • Goal: Partially met. The PR plumbs a user-facing determinism flag through create/show/Nereids translation and adds regression coverage for CTE/MTMV scenarios, but it does not preserve compatibility for existing persisted UDFs and it does not propagate the flag to cache consumers.
  • Scope/focus: The code changes are localized, but the end-to-end contract of the new property is incomplete.
  • Concurrency: No new lock-order or thread-safety issue found in the touched code; the blocking problems are replay/analysis-path correctness issues.
  • Lifecycle/static init: No special lifecycle or static initialization issue found.
  • Config/properties: A new user-visible UDF property (deterministic) is added and parsed, but not all relevant consumers honor it.
  • Compatibility/persistence: Blocking issue. Function JSON persistence changed without a backward-compatible upgrade path for existing function metadata.
  • Parallel code paths: Blocking issue. Cache/query-normalization paths still bypass the new flag.
  • Special conditional checks: No additional issue beyond the above.
  • Test coverage: Added DDL/CTE/MV coverage is useful, but there is no coverage for upgrade/replay compatibility or SQL/query cache behavior.
  • Test result files: The new .out files look consistent with the added regression cases.
  • Observability: No new observability gap found for this change.
  • Transaction/persistence/data writes: Metadata persistence is affected; the replay compatibility issue above is the main blocker.
  • FE-BE variable passing: Not applicable here.
  • Performance: No material performance regression found in the touched code.
  • Other issues: No additional distinct blocker beyond the two comments here and the already-open review threads.

User focus: none provided.

Comment thread fe/fe-catalog/src/main/java/org/apache/doris/catalog/Function.java Outdated
@hello-stephen
Copy link
Copy Markdown
Contributor

FE Regression Coverage Report

Increment line coverage 87.76% (43/49) 🎉
Increment coverage report
Complete coverage report

@linrrzqqq linrrzqqq force-pushed the udf-nondeterministic branch from f66187b to 9d73507 Compare April 24, 2026 02:22
@linrrzqqq
Copy link
Copy Markdown
Collaborator Author

run buildall

@linrrzqqq linrrzqqq changed the title [Fix](udf) mark udf nondeterministic [Enhancement](udf) support deterministic property for udf Apr 24, 2026
@hello-stephen
Copy link
Copy Markdown
Contributor

FE UT Coverage Report

Increment line coverage 20.41% (10/49) 🎉
Increment coverage report
Complete coverage report

@hello-stephen
Copy link
Copy Markdown
Contributor

FE Regression Coverage Report

Increment line coverage 66.15% (43/65) 🎉
Increment coverage report
Complete coverage report

@hello-stephen
Copy link
Copy Markdown
Contributor

FE Regression Coverage Report

Increment line coverage 42.57% (43/101) 🎉
Increment coverage report
Complete coverage report

@HappenLee
Copy link
Copy Markdown
Contributor

/review

Copy link
Copy Markdown
Contributor

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am requesting changes based on the already-open inline review threads; I did not add duplicate inline comments for the same issues.

Critical checkpoint conclusions:

  • Goal/test coverage: The PR aims to add an explicit deterministic property for Java/Python UDF, UDAF, and UDTF handling. The property is plumbed through catalog/Nereids classes and new tests exist, but existing review threads identify unresolved end-to-end gaps in cache/planner behavior and optimizer rewrites.
  • Scope/focus: The code changes are mostly localized, but the behavior change is broad because UDF determinism affects MV/MTMV eligibility, SQL cache, fragment query cache, CTE inlining, and optimizer rewrites.
  • Concurrency/lifecycle: I did not find new lock ordering, shared mutable state, or static initialization risks in this diff.
  • Configuration: No new config item is added; this is a persisted function property.
  • Compatibility/persistence: Existing review context already flags that old persisted functions lack the new field and may be replayed with a changed default, which is a blocking compatibility concern.
  • Parallel paths: Java/Python scalar, aggregate, and table-function wrappers are touched, but existing review context already flags remaining parallel paths that still do not respect the new determinism bit.
  • Conditional checks/error handling: Boolean property parsing follows existing property parsing style; I did not find an additional distinct error-handling issue.
  • Tests/results: Regression and FE unit coverage were added, with generated .out files present. Coverage still does not close the existing end-to-end gaps noted in the inline threads.
  • Observability/transaction/data-write/FE-BE protocol: No new observability need, data write path, or FE-BE protocol field was identified beyond persisted function metadata.
  • Performance: I did not find an additional distinct performance regression in the changed code.

User focus: No additional user-provided review focus was specified. I reviewed the full PR with extra attention to determinism propagation and did not find a new non-duplicate issue beyond the existing review threads.

@yujun777
Copy link
Copy Markdown
Contributor

yujun777 commented May 8, 2026

I think a boolean deterministic property may still be too coarse here. For UDFs, it may be better to consider the same three categories as function volatility:

  • immutable: same input always returns the same output, e.g. def f(x): return x + 1 or lower/abs-like pure computation.
  • stable: stable within one statement/query but may change across statements, e.g. a UDF returning the query start time, session/database context, or a statement-level config value.
  • volatile: each invocation may return a different result or the call count/location has semantics, e.g. uuid.uuid4(), random.random(), HTTP/RPC calls, or UDFs with side effects.

So for Python UDF we probably need to identify which of these cases we want to support. The current deterministic true/false split can distinguish immutable from non-immutable, but it cannot distinguish stable functions from volatile functions, while optimizer rules need different behavior for those two cases.

@linrrzqqq linrrzqqq force-pushed the udf-nondeterministic branch from 9d73507 to 2bbd937 Compare May 12, 2026 08:06
@linrrzqqq
Copy link
Copy Markdown
Collaborator Author

/review

Copy link
Copy Markdown
Contributor

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review completed. I am not adding duplicate inline comments because the remaining blocking concern is already covered by the existing thread on CreateFunctionCommand: the new UDF volatility flag is still not effective end-to-end for SQL cache / fragment query cache eligibility. In the current head, ExpressionAnalyzer still marks any Udf/AliasUdfBuilder as cannotProcessExpression, and PhysicalPlanTranslator.hasUndeterministicExpression() still returns true for any Udf.class/AliasFunction.class before relying on isDeterministic(), so a scalar UDF declared immutable/volatility=immutable is still treated as uncacheable in those paths.

Critical checkpoint conclusions:

  • Goal/test: The PR aims to let users classify scalar Java/Python UDF volatility and protect optimizer rewrites for volatile calls. The current tests cover construction/equality and some rewrite behavior, but they do not prove the cache-related advertised deterministic behavior end-to-end.
  • Scope/focus: The implementation is mostly focused on Nereids UDF volatility plumbing, though several end-to-end paths still need alignment.
  • Concurrency/lifecycle: No new concurrent mutable state or non-obvious lifecycle issue found in the reviewed FE changes.
  • Configuration/compatibility: No new config item. Persisted Function compatibility now has a null/default guard for old metadata.
  • Parallel paths: One important parallel path remains incomplete: SQL cache and fragment query cache still use UDF-class checks that bypass the new volatility semantics.
  • Special conditions: The VOLATILE identity logic is localized and guarded; no additional distinct issue found there.
  • Test coverage: Missing cache eligibility tests for immutable UDFs and negative tests for volatile UDFs through the cache decision paths.
  • Observability/transactions/data writes: Not applicable for this FE optimizer/catalog metadata change beyond normal function persistence.
  • FE/BE variable passing: Runtime execution does not appear to require BE volatility propagation; this is FE planning metadata.

User focus: No additional user-provided review focus was specified.

@linrrzqqq
Copy link
Copy Markdown
Collaborator Author

run buildall

@hello-stephen
Copy link
Copy Markdown
Contributor

TPC-H: Total hot run time: 29761 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 39d521e7b196f60e1af3bf8b3e0dd0c451ee7317, data reload: false

------ Round 1 ----------------------------------
orders	Doris	NULL	NULL	0	0	0	NULL	0	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	17792	3936	3914	3914
q2	q3	10731	864	610	610
q4	4661	463	354	354
q5	7478	1315	1131	1131
q6	190	172	139	139
q7	896	941	765	765
q8	9293	1395	1272	1272
q9	5681	5814	5341	5341
q10	6292	2076	1817	1817
q11	490	273	272	272
q12	692	419	295	295
q13	18248	3308	2753	2753
q14	291	287	262	262
q15	q16	904	876	795	795
q17	929	1048	795	795
q18	6481	5687	5587	5587
q19	1178	1283	1128	1128
q20	509	403	264	264
q21	4755	2344	1934	1934
q22	481	393	333	333
Total cold run time: 97972 ms
Total hot run time: 29761 ms

----- Round 2, with runtime_filter_mode=off -----
orders	Doris	NULL	NULL	150000000	42	6422171781	NULL	22778155	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	4665	4657	4627	4627
q2	q3	4684	4795	4186	4186
q4	2164	2181	1388	1388
q5	4994	5045	5323	5045
q6	186	168	136	136
q7	2114	1884	1632	1632
q8	3340	3105	3134	3105
q9	8414	8475	8385	8385
q10	4490	4573	4285	4285
q11	616	459	393	393
q12	685	761	521	521
q13	3332	3656	3003	3003
q14	306	309	275	275
q15	q16	774	807	706	706
q17	1405	1389	1265	1265
q18	7994	7107	6951	6951
q19	1190	1184	1180	1180
q20	2310	2250	1964	1964
q21	6171	5431	4837	4837
q22	516	473	393	393
Total cold run time: 60350 ms
Total hot run time: 54277 ms

Copy link
Copy Markdown
Contributor

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found one blocking issue: the PR commits local regression-test configuration changes that alter the default FE ports, failure limit, and Python UDF runtime. This will break standard regression-test execution for other developers and CI environments that use the repository defaults.

Critical checkpoint conclusions:

  • Goal and tests: the UDF volatility goal is partially covered by FE unit tests and a Python UDF regression test, but the committed default regression configuration prevents normal verification from being reliable.
  • Scope/focus: the volatility code is focused, but regression-test/conf/regression-conf.groovy contains unrelated environment-specific changes and should not be part of the PR.
  • Concurrency/lifecycle: no new FE locking or lifecycle hazard found in the reviewed volatility paths.
  • Configuration/persistence/compatibility: the volatility field is persisted and existing compatibility concerns are already covered in prior threads; the new issue here is the committed test-runner configuration change.
  • Parallel paths: Java/Python scalar UDF paths were reviewed; existing cache and rewrite-path concerns are already raised in prior review threads and not duplicated here.
  • Tests/results: regression coverage was added, but the default regression config change must be reverted so tests run against standard ports and do not stop after one failure.
  • Observability/performance/transactions: no additional issue found in these areas for this PR.

User focus: no additional user-provided review focus was specified.

Comment thread regression-test/conf/regression-conf.groovy
@hello-stephen
Copy link
Copy Markdown
Contributor

TPC-DS: Total hot run time: 170243 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 39d521e7b196f60e1af3bf8b3e0dd0c451ee7317, data reload: false

query5	4318	666	521	521
query6	332	220	198	198
query7	4320	602	311	311
query8	328	234	236	234
query9	8853	4021	3996	3996
query10	440	359	303	303
query11	5831	2396	2210	2210
query12	192	128	132	128
query13	1303	613	434	434
query14	6117	5388	5092	5092
query14_1	4418	4368	4364	4364
query15	206	204	185	185
query16	1015	429	424	424
query17	1129	779	617	617
query18	2649	481	354	354
query19	224	194	156	156
query20	142	131	129	129
query21	221	138	113	113
query22	13625	13678	13385	13385
query23	17130	16410	16032	16032
query23_1	16175	16168	16186	16168
query24	7391	1753	1364	1364
query24_1	1359	1355	1358	1355
query25	565	492	487	487
query26	1337	303	173	173
query27	2761	577	330	330
query28	4434	1991	1958	1958
query29	1013	660	534	534
query30	314	247	198	198
query31	1175	1072	945	945
query32	88	75	74	74
query33	552	363	296	296
query34	1209	1149	670	670
query35	775	785	657	657
query36	1370	1378	1206	1206
query37	154	114	91	91
query38	3241	3124	3083	3083
query39	928	921	911	911
query39_1	879	896	869	869
query40	236	154	134	134
query41	65	64	61	61
query42	110	115	107	107
query43	336	327	281	281
query44	
query45	212	203	196	196
query46	1130	1190	724	724
query47	2285	2301	2182	2182
query48	373	419	301	301
query49	647	530	436	436
query50	708	289	222	222
query51	4270	4265	4224	4224
query52	104	107	97	97
query53	261	281	214	214
query54	315	275	262	262
query55	97	94	89	89
query56	317	328	334	328
query57	1421	1393	1322	1322
query58	305	285	284	284
query59	1563	1598	1408	1408
query60	357	352	341	341
query61	196	179	194	179
query62	692	622	568	568
query63	242	202	211	202
query64	2433	893	726	726
query65	
query66	1722	537	415	415
query67	30086	29863	29862	29862
query68	
query69	484	349	311	311
query70	974	1018	960	960
query71	314	278	268	268
query72	3151	2811	2400	2400
query73	822	739	419	419
query74	5061	4896	4766	4766
query75	2769	2650	2322	2322
query76	2287	1130	782	782
query77	424	421	359	359
query78	12957	12937	12301	12301
query79	1521	999	738	738
query80	1309	577	479	479
query81	539	275	242	242
query82	943	162	124	124
query83	352	281	253	253
query84	268	149	117	117
query85	909	536	453	453
query86	449	342	326	326
query87	3496	3396	3243	3243
query88	3643	2703	2681	2681
query89	455	385	340	340
query90	1889	181	180	180
query91	181	173	145	145
query92	82	79	76	76
query93	1204	964	567	567
query94	729	345	292	292
query95	690	395	433	395
query96	1007	735	346	346
query97	2717	2697	2535	2535
query98	257	230	238	230
query99	1160	1168	982	982
Total cold run time: 254860 ms
Total hot run time: 170243 ms

@hello-stephen
Copy link
Copy Markdown
Contributor

FE Regression Coverage Report

Increment line coverage 71.52% (118/165) 🎉
Increment coverage report
Complete coverage report

@linrrzqqq linrrzqqq force-pushed the udf-nondeterministic branch from 39d521e to 9b804ab Compare May 14, 2026 02:00
@linrrzqqq
Copy link
Copy Markdown
Collaborator Author

/review

Copy link
Copy Markdown
Contributor

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found one additional blocking issue beyond the existing review threads. The PR goal is to add user-controlled UDF volatility/determinism and prevent unsafe rewrites for volatile UDF calls; the implementation adds catalog persistence, Nereids UDF expression identity, optimizer guards, and regression/unit coverage.

Critical checkpoint conclusions:

  • Goal and tests: partially covered by new FE unit tests and a Python UDF regression test, but SHOW CREATE FUNCTION for Python UDFs is not replayable after the new serialization branch. Existing threads also still cover broader end-to-end determinism/cache/rewrite concerns.
  • Scope/focus: the main code changes are focused on FE/Nereids UDF volatility; no BE/Cloud data path changes in the actual GitHub PR diff.
  • Concurrency/lifecycle: no new concurrent state or lock lifecycle is introduced. Volatile identity is per-expression and statement-scoped.
  • Configuration: no new runtime config items.
  • Compatibility/persistence: the new persisted vol field now has an IMMUTABLE fallback for missing old values, but replayability of SHOW CREATE for Python UDFs is incomplete.
  • Parallel paths: Java/Python scalar paths are handled; aggregate/table function volatility is explicitly rejected. Existing comments already cover remaining cache/rewrite paths that still treat UDFs specially.
  • Tests: new coverage exercises volatility behavior, but it does not validate SHOW CREATE FUNCTION output can be executed for Python UDFs.
  • Observability/transactions/data writes: not applicable beyond metadata persistence; no additional logging/metrics needed for this FE-only feature.
  • User focus: no additional user-provided review focus was supplied.

Please fix the inline issue and address the already-open review threads before merge.

Comment thread fe/fe-core/src/main/java/org/apache/doris/catalog/FunctionToSqlConverter.java Outdated
@linrrzqqq linrrzqqq changed the title [Enhancement](udf) support deterministic property for udf [Enhancement](udf) Support volatility property for scalar UDF May 14, 2026
@linrrzqqq linrrzqqq force-pushed the udf-nondeterministic branch from 9b804ab to 2987c8e Compare May 14, 2026 06:54
@linrrzqqq
Copy link
Copy Markdown
Collaborator Author

/review

Copy link
Copy Markdown
Contributor

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found additional issues beyond the existing review threads. The PR goal is clear (persist and honor scalar UDF volatility), and the new scalar create/show/replay coverage addresses part of that goal, but the implementation still has correctness gaps in volatile identity equality and in paths that also handle UDTFs or slot-dependent volatile UDFs.

Critical checkpoint conclusions: goal is only partially met; scope is mostly focused but touches shared rewrite utilities where old assumptions no longer hold; no new FE locking/concurrency issue found; no special lifecycle/static initialization issue found; persistence compatibility is improved by defaulting missing volatility to IMMUTABLE, but UDTF replay/properties are inconsistent; parallel rewrite paths were broadly updated from unique to volatile checks, but the alias-hoisting rule still assumes no input slots; tests cover scalar Python volatility and replay, but miss UDTF show/replay and join/slot-dependent volatile alias cases; no additional observability requirement identified; no transaction/data-write path impact identified; no FE-BE protocol change identified. User focus: no additional user-provided focus points were present.

Comment thread fe/fe-core/src/main/java/org/apache/doris/catalog/FunctionToSqlConverter.java Outdated
@linrrzqqq
Copy link
Copy Markdown
Collaborator Author

/review

Copy link
Copy Markdown
Contributor

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found current-head issues that need to be fixed before merge.

Checkpoint conclusions:

  • Goal/test proof: The volatility plumbing is partially covered, but one newly added unit test is inconsistent with the current implementation and will fail.
  • Scope/focus: The main changes are focused on UDF volatility, but there is stale code left from the previous aliasing attempt.
  • Concurrency/lifecycle/config: No new concurrency, lifecycle, or config issues found in the reviewed volatility path.
  • Compatibility/persistence: Existing review threads already cover serialization/replay compatibility concerns; I did not add duplicate comments.
  • Parallel code paths: Existing review threads already cover UDTF SHOW/SHOW CREATE handling; no additional distinct instance found.
  • Test coverage/results: The new UdfVolatilityTest contains a failing assertion against current AddProjectForUniqueFunction behavior.
  • Observability/transactions/data writes: Not applicable for this PR area.
  • User focus: No additional user-provided focus was specified.

@linrrzqqq linrrzqqq force-pushed the udf-nondeterministic branch from f0a416b to 38714f5 Compare May 14, 2026 09:07
@linrrzqqq
Copy link
Copy Markdown
Collaborator Author

/review

Copy link
Copy Markdown
Contributor

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Automated review completed. I am requesting changes because blocking issues from the existing inline review context remain on the current head, and I did not add duplicate inline comments. In particular, the new UdfVolatilityTest still cannot pass: testAddProjectForRepeatedVolatileUdf asserts aliases.size() is 0 and then immediately dereferences aliases.get(0), which is the same test-failure area already raised in the existing thread.

Critical checkpoint conclusions:

  • Goal: the PR aims to add scalar UDF volatility semantics and protect optimizer rewrites. The implementation is not yet proven correct because existing blocking optimizer/test issues remain.
  • Scope: the change is mostly focused on FE UDF volatility and Nereids rewrite guards, but some previously raised paths still need cleanup.
  • Concurrency: no new direct concurrent data structure or locking path was identified in the reviewed FE changes.
  • Lifecycle/persistence: function metadata persistence is involved; existing review context already covers compatibility/SHOW CREATE replay concerns, and I avoided duplicating those comments.
  • Configuration: no new dynamic configuration item is added.
  • Compatibility: user-visible CREATE/SHOW FUNCTION syntax and persisted function metadata are changed; compatibility/replay must be resolved before merge.
  • Parallel code paths: the PR updates many UniqueFunction guard paths to volatile-expression guards, but existing review context already identifies remaining optimizer-path risks.
  • Tests: regression and unit tests were added, but the current unit test source still contains a failing assertion/dereference sequence. I attempted the targeted FE UT, but the runner lacks thirdparty/installed/bin/protoc, so the test command could not get past generated-source setup.
  • Observability: no additional observability requirement was identified for this FE optimizer/metadata change.
  • Transactions/data writes: no direct transaction write-path change was identified.
  • FE/BE variable passing: no new FE-BE execution variable was introduced.
  • Performance: no additional distinct performance issue was found beyond the optimizer rewrite correctness concerns already raised.

User focus: no additional user-provided review focus was specified.

@linrrzqqq linrrzqqq force-pushed the udf-nondeterministic branch from 38714f5 to 4771ff8 Compare May 14, 2026 09:46
@linrrzqqq
Copy link
Copy Markdown
Collaborator Author

/review

Copy link
Copy Markdown
Contributor

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found one remaining correctness issue in the current PR. The volatility guards now avoid emitting an unreplayable VOLATILITY property for UDTFs, but SHOW CREATE still serializes UDTFs as scalar CREATE FUNCTION statements, so existing Java/Python table functions are still not faithfully replayable.

Critical checkpoint conclusions:

  • Goal/test: the PR adds user-declared volatility for scalar Java/Python UDFs and broader Nereids rewrite guards; scalar volatility has coverage, but UDTF SHOW CREATE replay remains incomplete.
  • Scope: the implementation is mostly focused, but the UDTF serialization fix is partial.
  • Concurrency/lifecycle: no new concurrent state or special lifecycle risk identified in the reviewed paths.
  • Configuration/compatibility: no new config; persisted Function volatility defaults to IMMUTABLE for old metadata, but new scalar defaults are VOLATILE through create analysis.
  • Parallel paths: scalar SHOW CREATE, SHOW FUNCTIONS, and tests were updated; UDTF statement-kind serialization was missed.
  • Tests: added tests cover scalar volatility and absence of VOLATILITY for UDTFs, but not UDTF SHOW CREATE replay.
  • Observability/performance/transactions: no additional concerns found.

User focus: no additional user-provided review focus was supplied.

@linrrzqqq linrrzqqq force-pushed the udf-nondeterministic branch from 4771ff8 to acf1ca7 Compare May 14, 2026 16:29
@linrrzqqq
Copy link
Copy Markdown
Collaborator Author

/review

Copy link
Copy Markdown
Contributor

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found one additional replay-fidelity issue in the latest patch. Several earlier blocking concerns are already tracked in existing inline threads and are not duplicated here, including UDF/UDTF SHOW CREATE fidelity, volatility rewrite safety, and regression configuration/test failures.

Critical checkpoint conclusions: goal is to add UDF volatility metadata and protect optimizer rewrites; current code is closer but not fully complete because SHOW CREATE replay can still lose persisted lifecycle settings. Scope is mostly focused, but replay serialization needs to cover all persisted create-time properties it claims to preserve. Concurrency/locking changes are not involved. Lifecycle/persistence is involved for Function metadata; volatility is persisted, but replay SQL is still not equivalent for Python UDF/UDAF expiration settings. No new config items are added. Compatibility is partly addressed by defaulting missing volatility to IMMUTABLE, but existing threads still cover remaining behavior concerns. Parallel optimizer paths were updated broadly from unique-function checks to volatile-expression checks; no new distinct issue beyond already-threaded concerns found. Test coverage adds unit and regression coverage, but it misses replay of non-default expiration_time. Observability is not materially affected. Data-write/transaction paths are not involved. User focus: no additional user-provided review focus was specified.

@linrrzqqq
Copy link
Copy Markdown
Collaborator Author

run buildall

@hello-stephen
Copy link
Copy Markdown
Contributor

TPC-H: Total hot run time: 29197 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit acf1ca7942d68b7d197b23bf967fa64eaa314b7f, data reload: false

------ Round 1 ----------------------------------
orders	Doris	NULL	NULL	0	0	0	NULL	0	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	17832	3913	3864	3864
q2	q3	10748	868	601	601
q4	4659	447	345	345
q5	7450	1322	1144	1144
q6	215	164	135	135
q7	916	931	752	752
q8	9706	1429	1218	1218
q9	6143	5363	5279	5279
q10	6296	2084	1796	1796
q11	472	265	255	255
q12	693	418	298	298
q13	18156	3332	2684	2684
q14	298	283	260	260
q15	q16	890	866	780	780
q17	953	1070	728	728
q18	6436	5563	5556	5556
q19	1193	1271	1077	1077
q20	510	397	260	260
q21	4601	2277	1861	1861
q22	422	353	304	304
Total cold run time: 98589 ms
Total hot run time: 29197 ms

----- Round 2, with runtime_filter_mode=off -----
orders	Doris	NULL	NULL	150000000	42	6422171781	NULL	22778155	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	4109	4044	4056	4044
q2	q3	4626	4746	4161	4161
q4	2087	2161	1349	1349
q5	4893	4943	5263	4943
q6	185	158	128	128
q7	2005	1786	1820	1786
q8	3463	3221	3183	3183
q9	8504	8432	8404	8404
q10	4474	4492	4219	4219
q11	587	435	407	407
q12	705	764	512	512
q13	3123	3617	2950	2950
q14	300	302	272	272
q15	q16	755	764	709	709
q17	1348	1318	1216	1216
q18	8107	7134	7028	7028
q19	1172	1133	1144	1133
q20	2219	2246	1958	1958
q21	6077	5464	4769	4769
q22	544	492	415	415
Total cold run time: 59283 ms
Total hot run time: 53586 ms

@hello-stephen
Copy link
Copy Markdown
Contributor

TPC-DS: Total hot run time: 170621 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit acf1ca7942d68b7d197b23bf967fa64eaa314b7f, data reload: false

query5	4345	663	505	505
query6	341	218	194	194
query7	4215	569	298	298
query8	326	233	231	231
query9	8846	4086	4001	4001
query10	482	355	298	298
query11	6040	2379	2213	2213
query12	193	135	127	127
query13	1277	619	475	475
query14	6817	5350	5035	5035
query14_1	4346	4352	4313	4313
query15	210	207	183	183
query16	1032	467	355	355
query17	1323	767	646	646
query18	2776	496	354	354
query19	327	204	178	178
query20	140	132	133	132
query21	232	137	131	131
query22	13632	13451	13435	13435
query23	17119	16375	16597	16375
query23_1	16184	16281	16262	16262
query24	8180	1838	1439	1439
query24_1	1461	1427	1393	1393
query25	591	524	490	490
query26	1425	324	173	173
query27	2956	603	360	360
query28	4497	1969	2005	1969
query29	982	668	495	495
query30	296	234	189	189
query31	1107	1061	936	936
query32	86	68	69	68
query33	527	341	289	289
query34	1143	1151	649	649
query35	752	779	658	658
query36	1290	1347	1137	1137
query37	149	102	86	86
query38	3171	3122	3055	3055
query39	923	912	886	886
query39_1	877	866	887	866
query40	236	154	132	132
query41	64	60	58	58
query42	107	108	108	108
query43	321	319	278	278
query44	
query45	205	203	189	189
query46	1068	1166	694	694
query47	2292	2278	2231	2231
query48	396	418	311	311
query49	657	543	444	444
query50	745	295	222	222
query51	4312	4315	4262	4262
query52	103	106	93	93
query53	257	276	207	207
query54	304	273	262	262
query55	96	93	86	86
query56	311	319	306	306
query57	1452	1471	1349	1349
query58	332	273	264	264
query59	1551	1578	1378	1378
query60	328	321	314	314
query61	150	153	146	146
query62	670	616	555	555
query63	246	206	210	206
query64	2348	871	672	672
query65	
query66	1670	511	396	396
query67	30045	29969	29716	29716
query68	
query69	487	319	300	300
query70	987	999	931	931
query71	305	281	262	262
query72	3000	2693	2349	2349
query73	843	750	425	425
query74	5051	4898	4737	4737
query75	2767	2663	2329	2329
query76	2283	1105	759	759
query77	418	436	350	350
query78	12851	12935	12278	12278
query79	1531	1033	756	756
query80	1410	561	510	510
query81	522	276	247	247
query82	968	157	121	121
query83	356	269	240	240
query84	270	142	109	109
query85	921	536	440	440
query86	440	328	325	325
query87	3397	3332	3217	3217
query88	3558	2692	2658	2658
query89	436	383	333	333
query90	1947	179	174	174
query91	175	167	138	138
query92	74	75	74	74
query93	1083	965	561	561
query94	720	326	284	284
query95	678	459	349	349
query96	1031	739	329	329
query97	2682	2681	2561	2561
query98	236	229	226	226
query99	1081	1126	996	996
Total cold run time: 256309 ms
Total hot run time: 170621 ms

@hello-stephen
Copy link
Copy Markdown
Contributor

FE Regression Coverage Report

Increment line coverage 70.69% (123/174) 🎉
Increment coverage report
Complete coverage report

Copy link
Copy Markdown
Contributor

@yujun777 yujun777 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@hello-stephen
Copy link
Copy Markdown
Contributor

FE Regression Coverage Report

Increment line coverage 52.56% (123/234) 🎉
Increment coverage report
Complete coverage report

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants